Face Synthesis Driven by Audio Speech Input Based on HMMs
Authors
Abstract
In this paper, an HMM-based visual speech system driven by audio speech input is designed to render a face model while synchronous audio is played. Our approach differs substantially from the methods adopted by many other researchers. We first train a model for every initial and final in Mandarin, using a large quantity of audio training data recorded in different environments and spoken by different people. Then, recorded synchronous audio-visual speech data are used to adapt the trained models to our specific announcer. Such models are more robust in the synthesis phase, and satisfactory performance can be achieved even when the input audio speech is degraded by noise.
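The pipeline the abstract describes — classify the incoming audio against per-unit acoustic models, then use the recognized unit to drive the face — can be illustrated with a deliberately tiny sketch. This is not the paper's implementation: real systems use trained HMMs over spectral frames (e.g. MFCCs), whereas here each "model" is just a hypothetical mean feature vector, and the unit-to-viseme mapping is made up.

```python
# Toy sketch (illustrative only): each Mandarin initial/final is reduced
# to a mean acoustic feature vector; each incoming audio frame is
# classified to its nearest unit, and the unit selects a mouth-opening
# parameter for the face model. All numbers below are invented.

from math import dist  # Euclidean distance (Python 3.8+)

# Hypothetical acoustic means for three Mandarin units.
UNIT_MODELS = {
    "b":   (0.1, 0.9),
    "a":   (0.8, 0.2),
    "ang": (0.6, 0.7),
}

# Hypothetical mapping from recognized unit to a mouth-opening parameter.
VISEME_PARAM = {"b": 0.0, "a": 1.0, "ang": 0.6}

def frames_to_visemes(frames):
    """Classify each acoustic frame to its nearest unit model and
    return the corresponding viseme-parameter sequence."""
    out = []
    for f in frames:
        unit = min(UNIT_MODELS, key=lambda u: dist(f, UNIT_MODELS[u]))
        out.append(VISEME_PARAM[unit])
    return out

# A closed-mouth ("b"-like) frame followed by an open-vowel ("a"-like) frame.
print(frames_to_visemes([(0.1, 0.85), (0.75, 0.25)]))  # [0.0, 1.0]
```

In the actual system the per-frame nearest-mean decision would be replaced by HMM decoding over the whole utterance, which is what gives robustness to noisy input.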
Similar resources
Visual Speech Synthesis Based on Parameter Generation From HMM: Speech-Driven and Text-And-Speech-Driven Approaches
This paper describes a technique for synthesizing synchronized lip movements from an auditory input speech signal. The technique is based on an algorithm for parameter generation from HMM with dynamic features, which has been successfully applied to text-to-speech synthesis. Audio-visual speech unit HMMs, namely, syllable HMMs are trained with parameter vector sequences that represent both auditor...
Synthetic visual speech driven from auditory speech
We have developed two different methods for using auditory, telephone speech to drive the movements of a synthetic face. In the first method, Hidden Markov Models (HMMs) were trained on a phonetically transcribed telephone speech database. The output of the HMMs was then fed into a rule-based visual speech synthesizer as a string of phonemes together with time labels. In the second method, Artif...
Two- and Three-Dimensional Audio-Visual Speech Synthesis
An audio-visual speech synthesiser has been built that will generate animated computer-graphics displays of high-resolution, colour images of a speaker's mouth area. The visual displays can simulate the movements of the lower face of a talker for any spoken sentence of British English, given a text input. The synthesiser is based on a data-driven technique. It uses encoded, video-recorded image...
Speaker-independent 3D face synthesis driven by speech and text
In this study, a complete system that generates visual speech by synthesizing 3D face points has been implemented. The estimated face points drive MPEG-4 facial animation. This system is speaker independent and can be driven by audio or both audio and text. The synthesis of visual speech was realized by a codebook-based technique, which is trained with audio-visual data from a speaker. An audio...
Speech-to-lip movement synthesis based on the EM algorithm using audio-visual HMMs
This paper proposes a method to re-estimate output visual parameters for speech-to-lip movement synthesis using audio-visual Hidden Markov Models (HMMs) under the Expectation-Maximization (EM) algorithm. In the conventional methods for speech-to-lip movement synthesis, there is a synthesis method estimating a visual parameter sequence through the Viterbi alignment of an input acoustic speech sign...
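The Viterbi alignment mentioned above — finding the most likely HMM state path for the input audio, whose states then index the lip parameters — can be sketched minimally. This is an illustrative toy, not the cited paper's code: the two "mouth" states, the coarse acoustic labels, and all probabilities are invented for demonstration.

```python
# Minimal Viterbi sketch (illustrative only): decode the most likely
# state path for an observation sequence; in a speech-to-lip system the
# state path would then select the visual (lip) parameters per frame.

def viterbi(obs, states, start_p, trans_p, emit_p):
    """Return the most likely state path for `obs`."""
    # V[t][s] = (best probability of reaching s at time t, best path)
    V = [{s: (start_p[s] * emit_p[s][obs[0]], [s]) for s in states}]
    for o in obs[1:]:
        V.append({
            s: max(
                ((prob * trans_p[prev][s] * emit_p[s][o], path + [s])
                 for prev, (prob, path) in V[-1].items()),
                key=lambda t: t[0],
            )
            for s in states
        })
    return max(V[-1].values(), key=lambda t: t[0])[1]

# Toy two-state model ("closed" vs "open" mouth); observations are
# coarse acoustic labels. All probabilities are made up.
states = ["closed", "open"]
start_p = {"closed": 0.6, "open": 0.4}
trans_p = {"closed": {"closed": 0.7, "open": 0.3},
           "open":   {"closed": 0.4, "open": 0.6}}
emit_p = {"closed": {"silence": 0.8, "vowel": 0.2},
          "open":   {"silence": 0.1, "vowel": 0.9}}

path = viterbi(["silence", "vowel", "vowel"], states, start_p, trans_p, emit_p)
print(path)  # ['closed', 'open', 'open']
```

The EM-based method the abstract proposes goes beyond this hard alignment by re-estimating the visual parameters as expectations over all state paths rather than committing to the single best one.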
Journal title:
Volume, issue:
Pages: -
Publication date: 2002